AITopics | vowpal wabbit

vowpal wabbit

Online Learning Guide with Text Classification using Vowpal Wabbit (VW)

@machinelearnbot, Jan-18-2018, 04:55:14 GMT

Many e-commerce and tech companies rely on real-time training and prediction for their products. Google predicts click-through rates for its ads in real time; together with the advertiser's bid, this prediction feeds the auction mechanism that decides which ads to show to the user. Stack Overflow uses real-time predictions to automatically tag a question with the correct programming language so that it reaches the right people to answer it. An election management team might want to predict sentiment on Twitter in real time to assess the impact of its campaign.
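As a rough illustration of what real-time training means here, the sketch below updates a logistic regression model one example at a time, predicting before each update. It is not Vowpal Wabbit's implementation: the class name `OnlineLogisticRegression`, the whitespace tokenizer, the learning rate, and the toy stream are all assumptions made for this example.

```python
import math


class OnlineLogisticRegression:
    """Bag-of-words logistic regression trained one example at a time (SGD)."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.weights = {}  # token -> weight, grows as new tokens arrive

    @staticmethod
    def _tokens(text):
        # Hypothetical tokenizer: lowercase and split on whitespace.
        return text.lower().split()

    def predict_proba(self, text):
        score = sum(self.weights.get(tok, 0.0) for tok in self._tokens(text))
        score = max(-35.0, min(35.0, score))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-score))

    def learn(self, text, label):
        """Predict first, then take one gradient step. label is 0 or 1."""
        predicted = self.predict_proba(text)
        gradient = predicted - label  # d(log loss)/d(score)
        for tok in self._tokens(text):
            old = self.weights.get(tok, 0.0)
            self.weights[tok] = old - self.learning_rate * gradient
        return predicted


# Toy stream of (text, label) pairs; in production these arrive continuously.
stream = [
    ("great fast delivery", 1),
    ("terrible slow refund", 0),
    ("great product", 1),
    ("slow terrible support", 0),
]

model = OnlineLogisticRegression()
for text, label in stream * 5:
    model.learn(text, label)

print(model.predict_proba("great delivery"))
print(model.predict_proba("terrible support"))
```

Because the model is updated after every example, it never needs the full data set in memory, which is the property that makes this style of learning suitable for streams.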

machine learning, natural language, vowpal wabbit, (14 more...)

@machinelearnbot

Industry:

Information Technology > Services (0.55)
Education > Educational Setting > Online (0.47)

Technology:

Information Technology > Communications > Social Media (0.89)
Information Technology > Artificial Intelligence > Machine Learning (0.71)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.47)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.41)

Using Python Subprocess To Drive Machine Learning Packages (IT Best Kept Secret Is Optimization)

#artificialintelligence, Dec-1-2016, 10:25:31 GMT

Many state-of-the-art machine learning algorithms are available as open source software, and many of these packages are designed to be used via a command line interface. I much prefer to use Python, as I can mix many packages together and use a combination of NumPy, pandas, and scikit-learn to orchestrate my machine learning pipelines. I am not alone, and as a result many open source machine learning packages provide a Python API. Not all do, however: for instance, Vowpal Wabbit does not support a Python API that works with Anaconda.
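A minimal sketch of the general approach, driving a command line tool from Python with `subprocess`, might look like the following. It is not taken from the original post: the helper names `run_cli`, `vw_train_command`, and `vw_predict_command` and the file paths are hypothetical, and the Vowpal Wabbit commands assume a `vw` binary on the PATH. The demonstration at the end runs the Python interpreter itself so that the example works without Vowpal Wabbit installed.

```python
import subprocess
import sys


def run_cli(args, timeout=60):
    """Run a command line tool and return its standard output as text.

    Raises subprocess.CalledProcessError if the tool exits with a non-zero code.
    """
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        timeout=timeout,
        check=True,
    )
    return completed.stdout


def vw_train_command(data_path, model_path):
    # -d: input data, -f: where to save the trained model
    return ["vw", "-d", data_path, "-f", model_path, "--loss_function", "logistic"]


def vw_predict_command(data_path, model_path, predictions_path):
    # -t: test only (no learning), -i: load model, -p: write predictions
    return ["vw", "-d", data_path, "-t", "-i", model_path, "-p", predictions_path]


# Demonstration with a tool that is certainly installed: Python itself.
print(run_cli([sys.executable, "-c", "print('hello from a subprocess')"]))

# With Vowpal Wabbit installed, the same helper would be used as:
#   run_cli(vw_train_command("train.vw", "model.vw"))
#   run_cli(vw_predict_command("test.vw", "model.vw", "preds.txt"))
```

Passing the command as a list rather than a single shell string avoids quoting problems with file paths and keeps the call independent of the shell.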

artificial intelligence, machine learning, subprocess, (12 more...)

#artificialintelligence

Industry: Education (0.71)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Fast Recommendations for Activity Streams Using Vowpal Wabbit

#artificialintelligence, Jul-16-2016, 20:40:48 GMT

The problem of content discovery and recommendation is very common in many machine learning applications: social networks, news aggregators and search engines are constantly updating and tweaking their algorithms to give individual users a unique experience. Personalization engines suggest relevant content with the objective of maximizing a specific metric. For example: a news website might want to increase the number of clicks in a session; on the other hand, for an e-commerce app it is very important to identify visitors that are more likely to buy a product in order to target them with special offers. In this post I will explore some techniques that can be used to generate recommendations and predictions using the amazingly fast Vowpal Wabbit library. Make sure that you have installed scikit-learn and Vowpal Wabbit's Prerequisite Software.

information retrieval, machine learning, vowpal wabbit, (14 more...)

#artificialintelligence

Industry: Information Technology > Services (0.56)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.75)
Information Technology > Artificial Intelligence > Machine Learning (0.57)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.36)

Statistically adaptive learning for a general class of cost functions (SA L-BFGS)

Purpura, Stephen, Hillard, Dustin, Hubenthal, Mark, Walsh, Jim, Golder, Scott, Smith, Scott

arXiv.org Machine Learning, Sep-5-2012

We present a system that enables rapid model experimentation for tera-scale machine learning with trillions of non-zero features, billions of training examples, and millions of parameters. Our contribution to the literature is a new method (SA L-BFGS) for changing batch L-BFGS to perform in near real-time by using statistical tools to balance the contributions of previous weights, old training examples, and new training examples to achieve fast convergence with few iterations. The result is, to our knowledge, the most scalable and flexible linear learning system reported in the literature, beating standard practice with the current best system (Vowpal Wabbit and AllReduce). Using the KDD Cup 2012 data set from Tencent, Inc. we provide experimental results to verify the performance of this method.

artificial intelligence, inductive learning, machine learning, (19 more...)

arXiv.org Machine Learning

1209.0029

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Industry:

Information Technology (0.68)
Education > Educational Setting (0.48)
Media (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Hashing Algorithms for Large-Scale Learning

Li, Ping, Shrivastava, Anshumali, Moore, Joshua L., König, Arnd C.

Neural Information Processing Systems, Dec-31-2011

Minwise hashing is a standard technique in the context of search for efficiently computing set similarities. The recent development of b-bit minwise hashing provides a substantial improvement by storing only the lowest b bits of each hashed value. In this paper, we demonstrate that b-bit minwise hashing can be naturally integrated with linear learning algorithms such as linear SVM and logistic regression, to solve large-scale and high-dimensional statistical learning tasks, especially when the data do not fit in memory. We compare b-bit minwise hashing with the Count-Min (CM) and Vowpal Wabbit (VW) algorithms, which have essentially the same variances as random projections. Our theoretical and empirical comparisons illustrate that b-bit minwise hashing is significantly more accurate (at the same storage cost) than VW (and random projections) for binary data.
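To make the idea of storing only the lowest b bits concrete, here is a small sketch of minwise hashing followed by b-bit truncation. It is an illustration under assumptions rather than the authors' code: the function names are hypothetical, the permutations are simulated with a simple linear hash modulo a prime, and the correction used in `estimate_resemblance_b_bit` is the common approximation for sets that are small relative to the universe, where two different minima agree in their lowest b bits with probability about 2^-b.

```python
import random

PRIME = (1 << 61) - 1  # Mersenne prime, larger than any item id used below


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(num_hashes)]


def minhash_signature(items, hash_params):
    """For each hash function, keep the minimum hashed value over the set."""
    return [min((a * x + b) % PRIME for x in items) for a, b in hash_params]


def keep_lowest_bits(signature, b):
    """b-bit minwise hashing: store only the lowest b bits of each minimum."""
    mask = (1 << b) - 1
    return [value & mask for value in signature]


def match_fraction(sig1, sig2):
    return sum(u == v for u, v in zip(sig1, sig2)) / len(sig1)


def estimate_resemblance_b_bit(sig1, sig2, b):
    """Remove the chance agreements (about 2^-b) from the b-bit match rate."""
    chance = 1.0 / (1 << b)
    return (match_fraction(sig1, sig2) - chance) / (1.0 - chance)


# Two sets of 200 item ids sharing 100 ids: true resemblance is 100 / 300.
rng = random.Random(1)
pool = rng.sample(range(10**9), 300)
set_a, set_b = set(pool[:200]), set(pool[100:])

params = make_hash_params(1000)
sig_a = minhash_signature(set_a, params)
sig_b = minhash_signature(set_b, params)

full_estimate = match_fraction(sig_a, sig_b)
b_bit_estimate = estimate_resemblance_b_bit(
    keep_lowest_bits(sig_a, 2), keep_lowest_bits(sig_b, 2), 2
)
print(full_estimate, b_bit_estimate)  # both should land near 1/3
```

With b = 2, each stored value takes 2 bits instead of 61, at the cost of a somewhat noisier estimate for the same number of hash functions.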

artificial intelligence, b-bit minwise, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States (0.68)
North America > Canada (0.46)

Genre: Research Report (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.49)

Training Logistic Regression and SVM on 200GB Data Using b-Bit Minwise Hashing and Comparisons with Vowpal Wabbit (VW)

Li, Ping, Shrivastava, Anshumali, Konig, Christian

arXiv.org Machine Learning, Aug-15-2011

We generated a dataset of 200 GB with 10^9 features, to test our recent b-bit minwise hashing algorithms for training very large-scale logistic regression and SVM. The results confirm our prior work that, compared with the VW hashing algorithm (which has the same variance as random projections), b-bit minwise hashing is substantially more accurate at the same storage. For example, with merely 30 hashed values per data point, b-bit minwise hashing can achieve accuracy similar to that of VW with 2^14 hashed values per data point. We demonstrate that the preprocessing cost of b-bit minwise hashing is roughly on the same order of magnitude as the data loading time. Furthermore, by using a GPU, the preprocessing cost can be reduced to a small fraction of the data loading time. Minwise hashing has been widely used in industry, at least in the context of search. One reason for its popularity is that one can efficiently simulate permutations by, e.g., universal hashing. In other words, there is no need to store the permutation matrix. In this paper, we empirically verify this practice, by demonstrating that even using the simplest 2-universal hashing does not degrade the learning performance.
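For context on the baseline, the scheme referred to as the VW hashing algorithm is usually described in the feature hashing literature as follows: each feature index is mapped to one of k bins and added with a pseudo-random sign of +1 or -1, so that inner products between hashed vectors approximate the original inner products. The sketch below illustrates that description under assumptions; it is not Vowpal Wabbit's code, it uses `hashlib.blake2b` as a stand-in hash, and the function names are hypothetical.

```python
import hashlib


def _hash64(index, seed):
    """Deterministic 64-bit hash of a feature index (stand-in hash function)."""
    data = f"{seed}:{index}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def signed_feature_hash(active_indices, num_bins, seed=0):
    """Hash a binary feature vector, given as its active indices, into num_bins."""
    hashed = [0.0] * num_bins
    for index in active_indices:
        h = _hash64(index, seed)
        sign = 1.0 if h & 1 else -1.0   # lowest bit chooses the sign
        bin_id = (h >> 1) % num_bins    # remaining bits choose the bin
        hashed[bin_id] += sign
    return hashed


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Two binary data points with 100 active features each, 50 of them shared,
# so the original inner product is 50.
x = range(0, 100)
y = range(50, 150)
k = 2 ** 14  # number of hashed values per data point, as in the abstract

estimate = dot(signed_feature_hash(x, k), signed_feature_hash(y, k))
print(estimate)  # expected to be close to 50
```

The number of bins k plays the role of the "hashed values per data point" in the comparison above: a larger k means fewer collisions and a more accurate inner product, at a higher storage cost.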

b-bit minwise, dataset, minwise, (15 more...)

arXiv.org Machine Learning

1108.3072

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
(13 more...)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.62)
